Tabulation Based 5-Universal Hashing and Linear Probing

Authors

Mikkel Thorup

Yin Zhang

Abstract

Previously [SODA’04] we devised the fastest known algorithm for 4-universal hashing. The hashing was based on small pre-computed 4-universal tables. This led to a five-fold improvement in speed over direct methods based on degree 3 polynomials. In this paper, we show that if the pre-computed tables are made 5-universal, then the hash value becomes 5-universal without any other change to the computation. In relative terms, this leads to even bigger gains, since the direct methods for 5-universal hashing use degree 4 polynomials. Experimentally, we find that our method can gain up to an order of magnitude in speed over direct 5-universal hashing. Some of the most popular randomized algorithms have been proved to have the desired expected running time using 5-universal hashing, e.g., a non-recursive variant of quicksort takes O(n log n) expected time [Karloff Raghavan JACM’93], and linear probing does updates and searches in O(1) expected time [Pagh et al. SICOMP’09]. In contrast, inputs have been constructed that lead to much worse expected performance with some of the classic primality-based 2-universal hashing schemes. In the context of linear probing, we compare our new fast 5-universal hashing experimentally with the fastest known plain universal hashing. We know that any reasonable hashing scheme will work on random input, but from Pagh et al., we know that 5-universal hashing leads to good expected performance on all input. We use a dense interval as an example of a structured yet realistic input, wanting to see if this could push the fastest multiplication-shift based plain universal hashing into bad performance. Even though our 5-universal hashing itself is slower than the fast plain universal hashing, it makes linear probing much more robust.
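For orientation, the multiplication-shift scheme that the abstract uses as its plain universal baseline can be sketched as follows. This is a minimal illustration of the textbook multiply-shift construction, not the authors' implementation; the 64-bit word size, the function name `make_multiply_shift`, and the `out_bits` parameter are assumptions made for this sketch.

```python
import random

W = 64  # assumed word size in bits (not specified in the abstract)


def make_multiply_shift(out_bits, rng=random):
    """Return h(x) = ((a * x) mod 2^W) >> (W - out_bits) for a random odd a.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    a = rng.getrandbits(W) | 1   # force the multiplier to be odd
    mask = (1 << W) - 1          # keeps only the low W bits (mod 2^W)

    def h(x):
        return ((a * x) & mask) >> (W - out_bits)

    return h


# Hash a dense interval of keys into a table with 2^10 slots.
h = make_multiply_shift(10, random.Random(1))
buckets = [h(x) for x in range(1000)]
```

The hash costs one multiplication and one shift, which is why it is fast; the abstract's point is that this speed comes with weaker guarantees on structured inputs such as dense intervals.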


Similar resources

String hashing for linear probing

Linear probing is one of the most popular implementations of dynamic hash tables storing all keys in a single array. When we get a key, we first hash it to a location. Next we probe consecutive locations until the key or an empty location is found. At STOC’07, Pagh et al. presented data sets where the standard implementation of 2-universal hashing leads to an expected number of Ω(log n) probes....
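The probing procedure described in this snippet (hash to a location, then scan consecutive locations until the key or an empty slot is found) can be sketched as below. This is a minimal illustration, assuming a fixed-capacity table without deletions; the class name `LinearProbingTable` and its methods are hypothetical and do not come from the paper.

```python
class LinearProbingTable:
    """Fixed-capacity linear probing table storing keys in a single array.

    Illustrative sketch: no deletion or resizing, and None marks an empty slot,
    so None itself cannot be stored as a key.
    """

    def __init__(self, capacity, hash_fn):
        self.slots = [None] * capacity
        self.hash_fn = hash_fn
        self.size = 0

    def _probe(self, key):
        # Start at the hashed location and scan consecutive slots (wrapping
        # around) until we find the key or an empty slot.
        i = self.hash_fn(key) % len(self.slots)
        probes = 1
        while self.slots[i] is not None and self.slots[i] != key:
            i = (i + 1) % len(self.slots)
            probes += 1
        return i, probes

    def insert(self, key):
        """Insert key if absent and return the number of slots probed."""
        i, probes = self._probe(key)
        if self.slots[i] is None:
            # Always keep one slot empty so that every probe terminates.
            if self.size + 1 >= len(self.slots):
                raise RuntimeError("table full")
            self.slots[i] = key
            self.size += 1
        return probes

    def contains(self, key):
        i, _ = self._probe(key)
        return self.slots[i] is not None


table = LinearProbingTable(8, lambda k: k)
table.insert(1)   # lands in slot 1
table.insert(9)   # also hashes to slot 1, so it moves on to slot 2
```

The probe count returned by `insert` is the quantity whose expectation the cited analyses bound.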


Linear Probing with 5-wise Independence

Hashing with linear probing dates back to the 1950s, and is among the most studied algorithms for storing (key,value) pairs. In recent years it has become one of the most important hash table organizations since it uses the cache of modern computers very well. Unfortunately, previous analyses rely either on complicated and space consuming hash functions, or on the unrealistic assumption of free...


Tabulation-Based 5-Independent Hashing with Applications to Linear Probing and Second Moment Estimation

In the framework of Carter and Wegman, a k-independent hash function maps any k keys independently. It is known that 5-independent hashing provides good expected performance in applications such as linear probing and second moment estimation for data streams. The classic 5-independent hash function evaluates a degree 4 polynomial over a prime field containing the key domain [n] = {0, . . . , n −...
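The classic polynomial scheme mentioned in this snippet can be sketched as follows. This is an illustrative version only, assuming keys smaller than the Mersenne prime 2^61 - 1; the choice of prime and the names `poly_hash` and `make_poly5` are assumptions of this sketch, not details taken from the paper.

```python
import random

P = (1 << 61) - 1  # Mersenne prime 2^61 - 1; keys are assumed to be < P


def poly_hash(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x, modulo P.

    Coefficients are listed from the highest degree down (Horner's rule).
    """
    acc = 0
    for c in coeffs:
        acc = (acc * x + c) % P
    return acc


def make_poly5(rng=random):
    """Return a hash function given by a random degree 4 polynomial mod P."""
    coeffs = [rng.randrange(P) for _ in range(5)]  # five random coefficients
    return lambda x: poly_hash(coeffs, x)


h = make_poly5(random.Random(7))
value = h(123456789)  # a value in the range [0, P)
```

Each evaluation needs four multiplications and four modular reductions, which is the cost that tabulation-based schemes aim to avoid.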


Lecture 10 — March 20, 2012

In the last lecture, we finished up talking about memory hierarchies and linked cache-oblivious data structures with geometric data structures. In this lecture we talk about different approaches to hashing. First, we talk about different hash functions and their properties, from basic universality to k-wise independence to a simple but effective hash function called simple tabulation. Then, we ...
